UMCC_DLSI_SemSim: Multilingual System for Measuring Semantic Textual Similarity

نویسندگان

Alexander Chavez

Héctor Dávila

Yoan Gutiérrez-Vázquez

Antonio Fernández Orquín

Andrés Montoyo

Rafael Muñoz

چکیده

In this paper we describe the specifications and results of UMCC_DLSI system, which was involved in Semeval-2014 addressing two subtasks of Semantic Textual Similarity (STS, Task 10, for English and Spanish), and one subtask of Cross-Level Semantic Similarity (Task 3). As a supervised system, it was provided by different types of lexical and semantic features to train a classifier which was used to decide the correct answers for distinct subtasks. These features were obtained applying the Hungarian algorithm over a semantic network to create semantic alignments among words. Regarding the Spanish subtask of Task 10 two runs were submitted, where our Run2 was the best ranked with a general correlation of 0.807. However, for English subtask our best run (Run1 of our 3 runs) reached 16 place of 38 of the official ranking, obtaining a general correlation of 0.682. In terms of Task 3, only addressing Paragraph to Sentence subtask, our best run (Run1 of 2 runs) obtained a correlation value of 0.760 reaching 3 place of 34.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ExB Themis: Extensive Feature Extraction from Word Alignments for Semantic Textual Similarity

We present ExB Themis – a word alignmentbased semantic textual similarity system developed for SemEval-2015 Task 2: Semantic Textual Similarity. It combines both string and semantic similarity measures as well as alignment features using Support Vector Regression. It occupies the first three places on Spanish data and additionally places second on English data. ExB Themis proved to be the best ...

متن کامل

Meerkat Mafia: Multilingual and Cross-Level Semantic Textual Similarity Systems

We describe UMBC’s systems developed for the SemEval 2014 tasks on Multilingual Semantic Textual Similarity (Task 10) and Cross-Level Semantic Similarity (Task 3). Our best submission in the Multilingual task ranked second in both English and Spanish subtasks using an unsupervised approach. Our best systems for Cross-Level task ranked second in Paragraph-Sentence and first in both Sentence-Phra...

متن کامل

ECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity

To model semantic similarity for multilingual and cross-lingual sentence pairs, we first translate foreign languages into English, and then build an efficient monolingual English system with multiple NLP features. Our system is further supported by deep learning models and our best run achieves the mean Pearson correlation 73.16% in primary track.

متن کامل

Robust semantic text similarity using LSA, machine learning, and linguistic resources

Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines Latent Semantic Analysis and machine learning augmented with data from se...

متن کامل

A Large-Scale Multilingual Disambiguation of Glosses

Linking concepts and named entities to knowledge bases has become a crucial Natural Language Understanding task. In this respect, recent works have shown the key advantage of exploiting textual definitions in various Natural Language Processing applications. However, to date there are no reliable large-scale corpora of sense-annotated textual definitions available to the research community. In ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2014

UMCC_DLSI_SemSim: Multilingual System for Measuring Semantic Textual Similarity

نویسندگان

چکیده

منابع مشابه

ExB Themis: Extensive Feature Extraction from Word Alignments for Semantic Textual Similarity

Meerkat Mafia: Multilingual and Cross-Level Semantic Textual Similarity Systems

ECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity

Robust semantic text similarity using LSA, machine learning, and linguistic resources

A Large-Scale Multilingual Disambiguation of Glosses

عنوان ژورنال:

اشتراک گذاری